Instance selection of linear complexity for big data

نویسندگان
چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Instance selection of linear complexity for big data

Over recent decades, database sizes have grown considerably. Larger sizes present new challenges, because machine learning algorithms are not prepared to process such large volumes of information. Instance selection methods can alleviate this problem when the size of the data set is medium to large. However, even these methods face similar problems with very large-to-massive data sets. In this ...

متن کامل

A Fuzzy TOPSIS Approach for Big Data Analytics Platform Selection

Big data sizes are constantly increasing. Big data analytics is where advanced analytic techniques are applied on big data sets. Analytics based on large data samples reveals and leverages business change. The popularity of big data analytics platforms, which are often available as open-source, has not remained unnoticed by big companies. Google uses MapReduce for PageRank and inverted indexes....

متن کامل

CoSelect: Feature Selection with Instance Selection for Social Media Data

Feature selection is widely used in preparing highdimensional data for effective data mining. Attributevalue data in traditional feature selection differs from social media data, although both can be large-scale. Social media data is inherently not independent and identically distributed (i.i.d.), but linked. Furthermore, there is a lot of noise. The quality of social media data can vary drasti...

متن کامل

An Architecture for Security and Protection of Big Data

The issue of online privacy and security is a challenging subject, as it concerns the privacy of data that are increasingly more accessible via the internet. In other words, people who intend to access the private information of other users can do so more efficiently over the internet. This study is an attempt to address the privacy issue of distributed big data in the context of cloud computin...

متن کامل

Massively-Parallel Feature Selection for Big Data

We present the Parallel, Forward-Backward with Pruning (PFBP) algorithm for feature selection (FS) in Big Data settings (high dimensionality and/or sample size). To tackle the challenges of Big Data FS PFBP partitions the data matrix both in terms of rows (samples, training examples) as well as columns (features). By employing the concepts of p-values of conditional independence tests and meta-...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Knowledge-Based Systems

سال: 2016

ISSN: 0950-7051

DOI: 10.1016/j.knosys.2016.05.056